Skip to content

refactor: bring in types from zarr-metadata#3961

Open
d-v-b wants to merge 65 commits into
zarr-developers:mainfrom
d-v-b:use-zarr-metadata
Open

refactor: bring in types from zarr-metadata#3961
d-v-b wants to merge 65 commits into
zarr-developers:mainfrom
d-v-b:use-zarr-metadata

Conversation

@d-v-b

@d-v-b d-v-b commented May 10, 2026

Copy link
Copy Markdown
Contributor

replaces some of our types with exports from zarr-metadata. I expect a few related PRs, alternating between ones like this (importing types) and ones that add missing types to zarr-metadata.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.md
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions Bot added the needs release notes Automatically applied to PRs which haven't added release notes label May 10, 2026
@d-v-b d-v-b requested a review from ilan-gold May 10, 2026 08:38
@d-v-b

d-v-b commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

cc @chuckwondo

@codecov

codecov Bot commented May 10, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.46%. Comparing base (4e79f1b) to head (1f14d06).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3961      +/-   ##
==========================================
- Coverage   93.47%   93.46%   -0.01%     
==========================================
  Files          90       90              
  Lines       11967    11973       +6     
==========================================
+ Hits        11186    11191       +5     
- Misses        781      782       +1     
Files with missing lines Coverage Δ
src/zarr/codecs/__init__.py 100.00% <100.00%> (ø)
src/zarr/codecs/blosc.py 95.27% <100.00%> (-0.04%) ⬇️
src/zarr/codecs/bytes.py 98.75% <100.00%> (+0.03%) ⬆️
src/zarr/codecs/cast_value.py 98.59% <100.00%> (-0.01%) ⬇️
src/zarr/codecs/crc32c_.py 97.29% <100.00%> (+0.07%) ⬆️
src/zarr/codecs/gzip.py 92.85% <100.00%> (+0.17%) ⬆️
src/zarr/codecs/scale_offset.py 100.00% <100.00%> (ø)
src/zarr/codecs/sharding.py 91.75% <100.00%> (+0.04%) ⬆️
src/zarr/codecs/transpose.py 91.66% <100.00%> (+0.14%) ⬆️
src/zarr/codecs/zstd.py 90.90% <100.00%> (+0.16%) ⬆️
... and 7 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread pyproject.toml Outdated
@d-v-b d-v-b mentioned this pull request May 11, 2026
d-v-b added 5 commits May 11, 2026 16:39
- set `zarr-metadata` to resolve locally in local development
- add a section to the docs outlining the relationship between zarr and zarr-metadata packages
…e the lower zarr-metadata bound is published
@github-actions github-actions Bot removed the needs release notes Automatically applied to PRs which haven't added release notes label May 13, 2026
@d-v-b

d-v-b commented May 15, 2026

Copy link
Copy Markdown
Contributor Author

we should not merge this until #3972 is sorted out. that PR switches our pre-commit mypy check to run in a locked environment, which can include the package-local copy of zarr-metadata. Without that PR, mypy in pre-commit would run against the version of zarr-metadata on pypi, which would generate false alarms when we pushed pre-release changes to zarr-metadata + zarr-python.

Comment thread packages/zarr-metadata/src/zarr_metadata/v3/codec/bytes.py
d-v-b and others added 2 commits May 21, 2026 19:57
The Slow Hypothesis job checked out shallowly (no tags), so hatch-vcs could
not find the `zarr_metadata-v*` / `v*` tags and built the in-tree
zarr-metadata as `0.1.dev1`. After bumping zarr-python's floor to
`zarr-metadata>=0.2.0`, that stale version no longer satisfies the
constraint, producing a ResolutionImpossible at dependency-sync time:

    Cannot install zarr-metadata 0.1.dev1 ... and zarr==0.1.dev1 because
    these package versions have conflicting dependencies.

Add `fetch-depth: 0` so the workflow grabs all tags, matching test.yml and
the other package-building workflows. hatch-vcs then derives real versions
(zarr 3.x, zarr-metadata 0.2.x) and the floor resolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two low-severity findings from the branch review:

- releases.yml: the `verify_pypi_dependency` failure diagnostics wrapped `>=`
  in backticks inside double-quoted echo strings, so bash ran `>=` as command
  substitution — dropping `>=` from the message and creating a stray `=` file.
  Switch to single quotes with the requirement passed as a separate argument.

- core/metadata/v2.py: the "Re-export ... historical name" comment sat above
  an unrelated `parse_separator` import. Move it to the
  `ArrayV2MetadataDict = _ArrayMetadataV2` assignment it actually describes.

The third finding (pyright via uvx possibly not resolving src imports) was
verified as a non-issue: the zarr-metadata pyright CI job passes on the
current branch tip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@d-v-b

d-v-b commented May 22, 2026

Copy link
Copy Markdown
Contributor Author

this is ready for final review!

key notes for reviewers:

  • nothing should change at runtime
  • some tests have changed because they used invalid metadata
  • the struct dtype used lists instead of tuples for its JSON metadata as declared in zarr-python. That type is changed to use tuples, following our broader metadata convention. I do not think this will affect any consumers.
  • There's a fair number of infrastructure / build changes, which try to ensure that we don't ship a version of zarr that depends on unreleased features in zarr-metadata.

d-v-b and others added 3 commits May 28, 2026 21:02
# Conflicts:
#	tests/test_dtype/test_npy/test_string.py
…#176)

Bumps the actions group with 8 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [prefix-dev/setup-pixi](https://github.com/prefix-dev/setup-pixi) | `0.9.5` | `0.9.6` |
| [codecov/codecov-action](https://github.com/codecov/codecov-action) | `6.0.0` | `6.0.1` |
| [github/issue-metrics](https://github.com/github/issue-metrics) | `4.2.2` | `4.2.7` |
| [j178/prek-action](https://github.com/j178/prek-action) | `2.0.3` | `2.0.4` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `7.0.0` | `7.0.1` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `7.0.0` | `8.0.1` |
| [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) | `1.13.0` | `1.14.0` |
| [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) | `0.5.3` | `0.5.6` |



Updates `prefix-dev/setup-pixi` from 0.9.5 to 0.9.6
- [Release notes](https://github.com/prefix-dev/setup-pixi/releases)
- [Commits](prefix-dev/setup-pixi@1b2de7f...5185adf)

Updates `codecov/codecov-action` from 6.0.0 to 6.0.1
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@57e3a13...e79a696)

Updates `github/issue-metrics` from 4.2.2 to 4.2.7
- [Release notes](https://github.com/github/issue-metrics/releases)
- [Commits](github-community-projects/issue-metrics@c9e9838...1e38d5e)

Updates `j178/prek-action` from 2.0.3 to 2.0.4
- [Release notes](https://github.com/j178/prek-action/releases)
- [Commits](j178/prek-action@6ad8027...bdca6f1)

Updates `actions/upload-artifact` from 7.0.0 to 7.0.1
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v7...043fb46)

Updates `actions/download-artifact` from 7.0.0 to 8.0.1
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v7...3e5f45b)

Updates `pypa/gh-action-pypi-publish` from 1.13.0 to 1.14.0
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.13.0...cef2210)

Updates `zizmorcore/zizmor-action` from 0.5.3 to 0.5.6
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](zizmorcore/zizmor-action@b1d7e1f...5f14fd0)

---
updated-dependencies:
- dependency-name: prefix-dev/setup-pixi
  dependency-version: 0.9.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: codecov/codecov-action
  dependency-version: 6.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: github/issue-metrics
  dependency-version: 4.2.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: j178/prek-action
  dependency-version: 2.0.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/upload-artifact
  dependency-version: 7.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-version: 8.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: pypa/gh-action-pypi-publish
  dependency-version: 1.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@d-v-b d-v-b self-assigned this Jun 3, 2026
ls
ls dist

# ---------------------------------------------------------------------------

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test similar to https://github.com/zarr-developers/zarr-python/pull/3961/changes#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R267-R271 that runs the tests against the maximum version allowed from pypi? It feels like that would also catch any bugs present in that version. If there is a test for the minimum zarr_metadata version, how could a user add a feature that hasn't been released yet and also have fully-passing CI?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after this PR, our test suite will run against the in-tree version of zarr-metadata. that's always going to be the latest version

They must have ``must_understand`` set to ``False``, and may contain
arbitrary additional JSON data.
"""
AllowedExtraField = ExtensionFieldV3

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to deprecate these sorts of aliases? A custom __getattr__ that checks if the caller is trying to import one of these?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need to patch the module object for this. lmk if you think that's high enough value in this case -- iirc these types were never exported at the top level

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need to patch the module object for this.

Yes, I think so, something like https://github.com/scverse/anndata/blob/b0746ac7db220a3428ab93d3d1d92c1c1b0122c8/src/anndata/_io/__init__.py#L8-L17

d-v-b and others added 22 commits June 3, 2026 16:19
… close, codec order validation

- storage/_utils.py: clamp SuffixByteRequest start to max(0, ...) so
  suffix > len(data) returns all available bytes (B1)
- storage/_logging.py: materialise key_ranges into a list before building
  the log hint string, preventing one-shot generator exhaustion (B2)
- abc/store.py: normalise getsize_prefix argument to end with "/" before
  calling list_prefix, matching delete_dir/is_empty behaviour so sibling
  keys are not over-counted (B3)
- storage/_zip.py: guard ZipStore.close() with an early return when the
  store was never opened, avoiding AttributeError on missing _lock (B4)
- core/codec_pipeline.py: add the missing raise TypeError(msg) in the
  BytesBytesCodec-after-ArrayArrayCodec branch of codecs_from_list (B5)

Add regression tests for all five fixes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Make each of the five bugs fixed in this PR catchable by a property/stateful
or shared-harness test (verified red-green against the pre-fix source), and
remove the per-store special-case tests whose coverage is now subsumed:

- B1 (suffix clamp): broaden the `key_ranges` strategy to emit
  SuffixByteRequest / OffsetByteRequest / None with bounds that can exceed
  the value length, and give the stateful `get_partial_values` rule an
  independent oracle for every ByteRequest variant. The pure-function unit
  test `TestNormalizeByteRangeIndex` is kept as a fast deterministic guard.
- B2 (generator exhaustion): the shared `StoreTests.test_get_partial_values`
  now passes `key_ranges` as a one-shot generator and asserts one result per
  request, so any store/wrapper that exhausts the iterable early is caught
  across all stores. The logging-only regression test is removed as redundant.
- B3 (getsize_prefix sibling over-counting): the shared
  `StoreTests.test_getsize_prefix` now includes a sibling key ("cc/0") that
  must be excluded, covering every store deterministically. Add a matching
  `getsize_prefix` rule to the store state machine. The memory-only
  regression test is removed as redundant.
- B4 (ZipStore.close before open): replace the two example-based regression
  tests with a stateful lifecycle machine asserting `close()` never raises,
  regardless of open/IO history.
- B5 (missing raise in codec order validation): replace the single example
  with a property test over random {AA,AB,BB} codec orderings whose expected
  outcome (TypeError / ValueError / ok) is modelled independently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… use-zarr-metadata

# Conflicts:
#	docs/contributing.md
#	src/zarr/core/group.py
#	tests/test_codecs/test_cast_value.py
#	tests/test_codecs/test_scale_offset.py
The check-min-deps-floor hook used `language: system` with `entry:
python ci/check_min_deps_floor.py`, which runs against whatever is on
PATH. Hosts without a bare `python` (Debian/Ubuntu, bare uv-managed
environments) fail the hook with `os error 2` even though the check
itself passes. Switch to `language: python` so pre-commit (and prek in
CI via lint.yml) provisions an interpreter; the script is stdlib-only,
so no extra dependencies are needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bump the zarr-metadata floor from 0.2.0 to 0.3.0 (the release that promoted
the curated top-level front door, zarr-developers#4083) and adopt the renamed names.

The only consuming code affected by the 0.3.0 renames is the cast_value codec,
which imported `RoundingMode`/`OutOfRangeMode`; these are now imported under
their 0.3.0 names `CastRoundingMode`/`CastOutOfRangeMode`, aliased back to the
local names so zarr-python's own annotations are unchanged.

Update the dependency floor in [project.dependencies] and the min_deps pin
(enforced equal by check_min_deps_floor.py), refresh the contributing-guide
version references, and correct the stale floor cited in the 3961 changelog
fragment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace zarr-python's hand-defined copies of spec constants/literals with the
canonical definitions now exported at zarr-metadata's top level:

- codec/dtype/chunk-grid name strings in to_dict/from_dict/register_codec
  use the *_CODEC_NAME / *_CHUNK_GRID_NAME constants;
- the Endianness, BloscShuffle, BloscCname, sharding IndexLocation, and
  datetime/timedelta DateTimeUnit literals + their member tuples now come
  from zarr-metadata.

zarr-python's historical names (EndianLiteral, ENDIAN, IndexLocation,
INDEX_LOCATION, BLOSC_SHUFFLE, BLOSC_CNAME, DateTimeUnit, DATETIME_UNIT,
EndiannessStr, ENDIANNESS_STR, ...) are kept as explicit re-exports via
plain assignment, so existing imports keep working.

MemoryOrder (in-memory layout, not the v2 metadata `order` field) and the
chunk-key SeparatorLiteral (used for both v2 and v3, no single canonical
package name) are intentionally left as zarr-python definitions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The min_deps hatch env must test against the published zarr-metadata floor,
not the in-tree workspace member. With the inherited uv installer, hatch honors
the root `[tool.uv.sources] zarr-metadata = { workspace = true }` and substitutes
the workspace copy, whose hatch-vcs dev version (e.g. 0.3.1.devN) can never
satisfy the `zarr-metadata==<floor>` pin — so env creation failed with
"No solution found" (pre-existing breakage; see pypa/hatch#1639, where hatch's
`uv pip install` invocation interferes with uv workspace handling).

The previous `workspace.members = []` line was a no-op (not a recognized hatch
env key) and never actually opted out. Switch this env to the pip installer,
which ignores `[tool.uv.sources]` entirely, so `zarr-metadata==<floor>` resolves
against the published wheel. Verified: the env now installs zarr-metadata 0.3.0
from PyPI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants